biplotEZ

User-friendly biplots in R



Centre for Multi-Dimensional Data Visualisation (MuViSU)
muvisu@sun.ac.za



SASA 2024

Introduction



The biplotEZ package aims to provide users with EZier software to construct biplots.



What is a biplot?

Visualisation of multi-dimensional data in 2 or 3 dimensions.



A brief history of biplots and biplotEZ.

History (1)

1971

Gabriel, K.R., The biplot graphic display of matrices with application to principal component analysis. Biometrika, 58(3), pp.453-467.


1976

Prof Niël J le Roux presents a seminar on biplots.

Photo NJ_hist






History (2)

1996

John Gower publishes Biplots with David Hand.

Photo GH_hist

Prof le Roux introduces a Masters module on Biplots (Multidimensional scaling).

Rika Cilliers obtains her Masters on biplots for socio-economic progress under Prof le Roux.


1997

SASA conference paper: S-PLUS FUNCTIONS FOR INTERACTIVE LINEAR AND NON-LINEAR BIPLOTS by SP van Blerk, NJ le Roux & S Gardner.

History (3)

2001

Sugnet Gardner (Lubbe) obtains her PhD on biplots under Prof le Roux.

Photo SL_hist

Louise Wood obtains her Masters on biplots for socio-economic development under Prof le Roux.


2003: Adele Bothma obtains her Masters on biplots for school results under Prof le Roux.

2007: Idele Walters obtains her Masters on biplots for exploring the gender gap under Prof le Roux.

2008: Ryan Wedlake obtains his Masters on robust biplots under Prof le Roux.




History (4)

2009

BiplotGUI for Interactive Biplots, Anthony le Grange.

2010: André Mostert obtains his Masters on biplots in industry under Prof le Roux.

2011

John Gower, Sugnet Lubbe and Niël le Roux publish Understanding Biplots.

Photo UB_hist

The R package UBbipl is developed alongside the book, but never published.

History (5)

2013: Hilmarie Brand obtains her Masters on PCA and CVA biplots under Prof le Roux.

2014: Opeoluwe Oyedele obtains her PhD on Partial Least Squares biplots under Sugnet Lubbe.

2015: Ruan Rossouw obtains his PhD on using biplots for multivariate process monitoring under Prof le Roux.

2016: Ben Gurr obtains his Masters on biplots for crime data under Prof le Roux.


2019

Johané Nienkemper-Swanepoel obtains her PhD on MCA biplots under Prof le Roux and Sugnet Lubbe.

Photo JN_hist

Carel van der Merwe obtains his PhD using biplots. Carel supervises 4 Master’s projects on biplots.

  • Justin Perrang, Francesca van Niekerk, David Rodwell, Delia Sandilands


History (6)

2020

Raeesa Ganey obtains her PhD on Principal Surface Biplots under Sugnet Lubbe.

Photo RG_hist

André Mostert obtains his PhD on multidimensional scaling for identification of contributions to out of control multivariate processes under Sugnet Lubbe.

Adriaan Rowen obtains his Masters using biplots to understand black-box machine learning models.

2022

Zoë-Mae Adams obtains her Masters on biplots in sentiment classification under Johané Nienkemper-Swanepoel.

Photo ZA_hist

History (7)

2023

bipl5 for Exploding Biplots, Ruan Buys.

2024

Ruan Buys obtains his Masters on Exploding biplots under Carel van der Merwe.

Photo RB_hist

Adriaan Rowen to submit his PhD using biplots to understand black-box machine learning models.

Peter Manefeldt to submit his Masters using multidimensional scaling for interpretability of random forest models.

Photo PM_hist

The Team

Photo 1
Photo 2
Photo 3
Photo 4
Photo 5
Photo 6
Photo 7

What are biplots?

  • The biplot is a powerful and very useful data visualisation tool.

  • Biplots make the information in a table of data transparent, revealing the main structures in the data in a methodical way, for example patterns of correlation between variables or similarities between the observations.

  • A biplot is a generalisation of the two-dimensional scatterplot to data that exist in a higher-dimensional space, where information on both samples and variables can be displayed graphically.

  • There are different types of biplots, based on various multivariate data analysis techniques.

  • The simplest type of biplot is based on Principal Component Analysis.

Theory behind constructing a PCA biplot

Data: \({\bf{X}}\)

#          X1       X2        X3
# 1  5.418840 5.054240  8.711160
# 2  3.129920 1.783160  3.385920
# 3  6.128080 2.173200  8.173560
# 4  6.781120 4.753280  8.731640
# 5  7.346560 5.893200 11.303040
# 6  7.208200 3.744000 10.075760
# 7  7.039440 5.213640  8.608840
# 8  5.465720 4.492640  5.596520
# 9  7.723240 4.708120 11.357480
# 10 7.109560 4.987520  7.732840
# 11 8.135800 4.392000  9.264840
# 12 6.287480 5.908720  7.488240
# 13 4.648880 7.198280  8.573720
# 14 5.798600 6.120080  8.254840
# 15 8.084560 3.234840  8.966560
# 16 6.157773 5.743455  9.899045
# 17 3.556727 2.026318  3.847636
# 18 6.963727 2.469545  9.288136
# 19 7.705818 5.401455  9.922318
# 20 8.348364 6.696818 12.844364
# 21 8.191136 4.254545 11.449727
# 22 7.999364 5.924591  9.782773
# 23 6.211045 5.105273  6.359682
# 24 8.776409 5.350136 12.906227
# 25 8.079045 5.667636  8.787318

Theory behind constructing a PCA biplot

Geometrically, the rows of \({\bf{X}}\) give the coordinates of the \(n\) samples in the \(p\)-dimensional space \(\mathbb{R}^p\).

The aim is to seek an \(r\)-dimensional plane that contains the points whose coordinates are given by the rows of \({\bf{\hat{X}}}_{[r]}\) which minimises a least squares criterion given by, \[\begin{equation} || {\bf{X}} - {\bf{\hat{X}}}_{[r]}||^2 = tr\{({\bf{X}} - {\bf{\hat{X}}}_{[r]})({\bf{X}} - {\bf{\hat{X}}}_{[r]})'\}. \end{equation}\]

The best approximation that minimises the least squares criterion is the \(r\)-dimensional Eckart-Young approximation given by \({\bf{\hat{X}}}_{[r]} = {\bf{U}} {\bf{D}}_{[r]} {\bf{V}}'\), where \({\bf{X}} = {\bf{U}} {\bf{D}} {\bf{V}}'\) is the singular value decomposition of \({\bf{X}}\).

Representing samples

A standard result for \(r=2\) is that the row vectors of \({\bf{\hat{X}}}_{[2]}\) are the orthogonal projections of the corresponding row vectors of \({\bf{X}}\) onto the column space of \({\bf{V}}_2\). The projections are therefore,

\[\begin{equation} {\bf{X}} {\bf{V}}_2. \end{equation}\] These projections are also known as the first two principal components.
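The steps above can be sketched in base R, as a minimal illustration with the iris measurements (not the package internals):

```r
# Sketch in base R: Eckart-Young rank-2 approximation and the
# principal component scores via the singular value decomposition.
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
s <- svd(X)                      # X = U D V'
V2 <- s$v[, 1:2]                 # first two right singular vectors
Xhat2 <- X %*% V2 %*% t(V2)      # best rank-2 least squares approximation
Z <- X %*% V2                    # first two principal components (scores)

# The residual sum of squares equals the sum of the discarded
# squared singular values.
rss <- sum((X - Xhat2)^2)
sum(s$d[3:4]^2)                  # same value as rss
```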

Representing variables

The columns of \({\bf{X}}\) are represented by the rows of \({\bf{V}}_2\) (the first two columns of \({\bf{V}}\)), which provide the directions of the biplot axes for the variables.

Calibrated biplot axes

We have constructed a biplot, but the variables represented by the vectors (arrows) have no calibration.

That is, there are no markers on the vectors representing the variables, analogous to the scale markers on the axes of an ordinary scatterplot.

To construct a biplot axis with relevant markers for a variable, a \((p-1)\)-dimensional hyperplane \(\mathscr{N}\) perpendicular to the Cartesian axis is required.

First plane

From the data, \(p = 3\); therefore, a two-dimensional hyperplane is constructed perpendicular to \(X_1\), through a specific value of \(X_1\), say \(\mu\).

The intersection of the biplot plane \(\mathscr{L}\) and \(\mathscr{N}\) is an \((r-1)\)-dimensional intersection space, which in this case is a line. All the points on this intersection line in \(\mathscr{L}\) predict the value \(\mu\) for the \(X_1\)-axis.

Second plane

The plane \(\mathscr{N}\) is shifted orthogonally to pass through another value on \(X_1\), and all the points on the intersection line of \(\mathscr{L}\) and \(\mathscr{N}\) predict the value through which the plane passes.

Intersection lines

As the plane \(\mathscr{N}\) is shifted along the \(X_1\)-axis, a series of parallel intersection spaces is obtained.

Any line passing through the origin will pass through these intersection spaces and can be used as an axis fitted with markers according to the value associated with the particular intersection space.

To facilitate orthogonal projection onto the axis, similar to an ordinary scatterplot, the line orthogonal to these intersection spaces is chosen.
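A minimal base-R sketch of where a calibrated marker lands, assuming centred data and using the defining property that a biplot point \(\mathbf{z}\) predicts the value \(\mathbf{z}'\mathbf{v}_k\) for variable \(k\) (the variable names below are illustrative, not the package's calibration code):

```r
# Sketch (base R): placing a calibrated marker on a biplot axis.
# For variable k, the axis direction in the biplot is the k-th row
# of V2; the marker for a (centred) value mu lies where the
# prediction equals mu.
X  <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
V2 <- svd(X)$v[, 1:2]
k  <- 1                          # Sepal.Length axis
vk <- V2[k, ]                    # axis direction for variable k
mu <- 0.5                        # a centred value to mark
marker <- mu * vk / sum(vk^2)    # marker position in the biplot plane

# Defining property: the marker itself predicts the value mu for
# variable k (prediction of variable k at point z is sum(z * vk)).
sum(marker * vk)                 # equals mu
```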

PCA biplot

With variable 1

PCA biplot

With all variables

Flow of functions in biplotEZ

Main Function

biplot()

Type of Biplot

PCA()
CVA()
PCO()
CA()

Aesthetics

samples()
axes()
newsamples()
newaxes()

Operations

prediction()
interpolate()
translate()
density()
fit.measures()
classify()
alpha.bags()
ellipses()
rotate()
reflect()
zoom()
regress()
splines()

Plotting

plot()

First step to create a biplot

biplot()

First step to create a biplot

biplot(data = iris,
       )

First step to create a biplot

biplot(data = iris,
       group.aes = iris[,5], 
       Title = "My first biplot"
       )
# Object of class biplot, based on 150 samples and 5 variables.
# 4 numeric variables.
# 1 categorical variable.
Argument Description
data a dataframe or matrix containing all variables the user wants to analyse.
classes a vector identifying class membership. Required for CVA biplots.
group.aes Variable from the data to be used as a grouping variable.
center a logical value indicating whether data should be column centered, with default TRUE.
scaled a logical value indicating whether data should be standardised to unit column variances, with default FALSE.
Title Title of the biplot to be rendered.
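The center and scaled arguments correspond to base R's scale(); a minimal illustration of their effect on the data before the biplot is computed:

```r
# Illustration (base R): what center = TRUE and scaled = TRUE do.
X  <- as.matrix(iris[, 1:4])
Xc <- scale(X, center = TRUE, scale = FALSE)   # center = TRUE (default)
Xs <- scale(X, center = TRUE, scale = TRUE)    # scaled = TRUE

colMeans(Xc)      # all (numerically) zero after centring
apply(Xs, 2, sd)  # all equal to 1 after standardising
```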

PCA function

PCA()
Argument Description
bp Object of class biplot.
dim.biplot Dimension of the biplot. Only values 1, 2 and 3 are accepted, with default 2.
e.vects Which eigenvectors (principal components) to extract, with default 1:dim.biplot.
group.aes Grouping variable for the biplot, if not specified in biplot().
show.class.means TRUE or FALSE: Indicating whether group means should be plotted in the biplot, with default FALSE.
correlation.biplot TRUE or FALSE: Indicating whether distances or correlations between the variables are optimally approximated, with default FALSE.

Data

tibble(iris)
# # A tibble: 150 × 5
#    Sepal.Length Sepal.Width Petal.Length
#           <dbl>       <dbl>        <dbl>
#  1          5.1         3.5          1.4
#  2          4.9         3            1.4
#  3          4.7         3.2          1.3
#  4          4.6         3.1          1.5
#  5          5           3.6          1.4
#  6          5.4         3.9          1.7
#  7          4.6         3.4          1.4
#  8          5           3.4          1.5
#  9          4.4         2.9          1.4
# 10          4.9         3.1          1.5
# # ℹ 140 more rows
# # ℹ 2 more variables: Petal.Width <dbl>,
# #   Species <fct>

PCA biplot

biplot(data = iris, group.aes = iris[,5],
       Title = "My first biplot") |>

PCA biplot

biplot(data = iris, group.aes = iris[,5],
       Title = "My first biplot") |> 
  PCA() |> 
  plot()

Flow of functions in biplotEZ

Main Function

biplot()

Type of Biplot

PCA()
CVA()
PCO()
CA()

Aesthetics

samples()
axes()
newsamples()
newaxes()

Operations

prediction()
interpolate()
translate()
density()
fit.measures()
classify()
alpha.bags()
ellipses()
rotate()
reflect()
zoom()
regress()
splines()

Plotting

plot()

Aesthetics: samples()

Change the colour, plotting character and character expansion of the samples.

  samples(col = c("orange","purple","gold"), pch = c(15,1,17), cex = 1.2, 
          opacity = 0.6) |> 

Aesthetics: samples()

Change the colour, plotting character and character expansion of the samples.

biplot(iris, group.aes = iris[,5]) |> 
  PCA() |> 
  samples(col = c("orange","purple","gold"), pch = c(15,1,17), cex = 1.2, 
          opacity = 0.6) |> 
  plot()

Aesthetics: samples()

Select certain groups, and add labels to the samples

biplot(iris, group.aes = iris[,5]) |> 
  PCA() |> 
  samples(which = c(1,2), col = c("orange","purple"), label = TRUE) |> 
  plot()

Aesthetics: samples()

Other arguments

Argument Description
label.col Colour of labels
label.cex Text expansion of the labels
label.side Side at which the label of the plotted point appears - “bottom” (default), “top”, “left”, “right”
label.offset Offset of the label from the plotted point
connected TRUE or FALSE: whether samples are connected, with default FALSE
connect.col Colour of the connecting line
connect.lty Line type of the connecting line
connect.lwd Line width of the connecting line

Aesthetics: axes()

Change the colour and line width of the axes

biplot(iris[,1:4]) |> PCA() |> 
  samples(col = "grey", opacity = 0.5) |>

Aesthetics: axes()

Change the colour and line width of the axes

biplot(iris[,1:4]) |> PCA() |> 
  samples(col = "grey", opacity = 0.5) |>
  axes(col = "rosybrown", label.dir = "Orthog", lwd = 2) |> 
  plot()

Aesthetics: axes()

Show the first two axes with vector representation and unit circle

biplot(iris[,1:4]) |> PCA() |> 
  samples(col = "grey", opacity = 0.5) |>
  axes(which = 1:2, col = "rosybrown", vectors = TRUE, unit.circle = TRUE) |> 
  plot()

Aesthetics: axes()

Other arguments

Axis labels
ax.names
label.dir
label.col
label.cex
label.line
label.offset

Ticks
ticks
tick.size
tick.label
tick.label.side
tick.label.col

Prediction
predict.col
predict.lwd
predict.lty

Orthogonal
orthogx
orthogy

Flow of functions in biplotEZ

Main Function

biplot()

Type of Biplot

PCA()
CVA()
PCO()
CA()

Aesthetics

samples()
axes()
newsamples()
newaxes()

Operations

prediction()
interpolate()
translate()
density()
fit.measures()
classify()
alpha.bags()
ellipses()
rotate()
reflect()
zoom()
regress()
splines()

Plotting

plot()

Prediction of samples

prediction()

out <- biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> 
  samples(col = c("orange","purple","gold"), opacity = 0.5) |>

Prediction of samples

prediction()

out <- biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> 
  samples(col = c("orange","purple","gold"), opacity = 0.5) |>
  prediction(predict.samples = c(1:2,51:52,101:102)) |>

Prediction of samples

prediction()

out <- biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> 
  samples(col = c("orange","purple","gold"), opacity = 0.5) |>
  prediction(predict.samples = c(1:2,51:52,101:102)) |>
  axes(predict.col = "red", predict.lwd = 1.5, predict.lty = 2) |> plot()

Prediction of samples

Predict only on the variable Sepal.Length: use the which argument.

biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> 
  samples(col = c("orange","purple","gold"), opacity = 0.5) |>
  prediction(predict.samples = c(1:2,51:52,101:102), which = "Sepal.Length") |>
  axes(predict.col = "red", predict.lwd = 1.5, predict.lty = 2) |> plot()

Prediction of group means

biplot(iris[,1:4], group.aes = iris[,5]) |> PCA(show.class.means = TRUE) |> 
  samples(col = c("orange","purple","gold"), opacity = 0.5) |>
  prediction(predict.means = TRUE) |>
  axes(predict.col = "red", predict.lwd = 1.5, predict.lty = 2) |> plot()

Predictions

summary(out)
# Object of class biplot, based on 150 samples and 4 variables.
# 4 numeric variables.
# 
# Sample predictions
#     Sepal.Length Sepal.Width Petal.Length Petal.Width
# 1       5.083039    3.517414     1.403214   0.2135317
# 2       4.746262    3.157500     1.463562   0.2402459
# 51      6.757521    3.449014     4.739884   1.6079559
# 52      6.389336    3.210952     4.501645   1.5094058
# 101     6.751606    2.836199     5.928106   2.1069758
# 102     5.977297    2.517932     5.070066   1.7497923
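The printed sample predictions are the rank-2 Eckart-Young reconstruction with the column means added back; a base-R sketch (not the package code, assuming the default centred, unscaled PCA):

```r
# Sketch (base R): reproducing the 2-D sample predictions as the
# rank-2 reconstruction, with the column means added back.
X  <- as.matrix(iris[, 1:4])
mu <- colMeans(X)
Xc <- sweep(X, 2, mu)
V2 <- svd(Xc)$v[, 1:2]
pred <- sweep(Xc %*% V2 %*% t(V2), 2, mu, "+")
round(pred[c(1:2, 51:52, 101:102), ], 6)   # compare with summary(out) above
```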

Interpolation of samples

biplot(iris[1:100,]) |> PCA() |> 

Interpolation of samples

biplot(iris[1:100,]) |> PCA() |> 
  interpolate(newdata = iris[101:150,]) |> 

Interpolation of samples

biplot(iris[1:100,]) |> PCA() |> 
  interpolate(newdata = iris[101:150,]) |> 
  newsamples(col = "red") |> plot()

Interpolation of axes

biplot(iris[,1:3]) |> PCA() |> 
    interpolate(newdata = NULL, newvariable = iris[,4]) |> 
    newaxes(X.new.names = "Petal.Width") |> plot()

Translation

Automatically or manually translate the axes away from the center of the plot.

biplot(iris)|> 
      PCA(group.aes = iris[,5]) |> 
      translate_axes(swop = TRUE, delta = 0.2) |> plot(exp.factor = 3)

Density plots

On the first group

biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> 

Density plots

On the first group

biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> 
  density2D(which = 1, col = c("white","purple","cyan","blue")) |> plot()

Density plots

On the second group, and adding contours

biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> 
  density2D(which = 2, col = c("white","purple","cyan","blue"),
            contours = TRUE) |> plot()

Density plots

On the third group, and changing the colour of the contours.

biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> 
  density2D(which = 3, col = c("white","purple","cyan","blue"),contours = TRUE,
            contour.col = "grey") |> plot()

Fit measures

out2 <- biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> fit.measures()
summary(out2)
# Object of class biplot, based on 150 samples and 4 variables.
# 4 numeric variables.
# 
# Quality of fit in 2 dimension(s) = 97.8% 
# Adequacy of variables in 2 dimension(s):
# Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
#    0.5617091    0.5402798    0.7639426    0.1340685 
# Axis predictivity in 2 dimension(s):
# Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
#    0.9579017    0.8400028    0.9980931    0.9365937 
# Sample predictivity in 2 dimension(s):
#         1         2         3         4         5         6         7         8 
# 0.9998927 0.9927400 0.9999141 0.9991226 0.9984312 0.9949770 0.9914313 0.9996346 
#         9        10        11        12        13        14        15        16 
# 0.9998677 0.9941340 0.9991205 0.9949153 0.9945491 0.9996034 0.9942676 0.9897890 
#        17        18        19        20        21        22        23        24 
# 0.9937752 0.9990534 0.9972926 0.9928624 0.9896250 0.9932656 0.9918132 0.9955885 
#        25        26        27        28        29        30        31        32 
# 0.9812917 0.9897303 0.9979903 0.9990514 0.9963870 0.9975607 0.9985741 0.9876345 
#        33        34        35        36        37        38        39        40 
# 0.9833383 0.9957412 0.9970200 0.9935405 0.9859750 0.9953399 0.9994047 0.9990244 
#        41        42        43        44        45        46        47        48 
# 0.9980903 0.9756895 0.9953372 0.9830035 0.9763861 0.9959863 0.9905695 0.9987006 
#        49        50        51        52        53        54        55        56 
# 0.9996383 0.9987482 0.9275369 0.9996655 0.9544488 0.9460515 0.9172857 0.9061058 
#        57        58        59        60        61        62        63        64 
# 0.9727694 0.9996996 0.8677939 0.8686502 0.9613130 0.9328852 0.4345132 0.9679973 
#        65        66        67        68        69        70        71        72 
# 0.7995848 0.9083037 0.7968614 0.5835260 0.7900027 0.8575646 0.8524748 0.6615410 
#        73        74        75        76        77        78        79        80 
# 0.9367709 0.8661203 0.8350955 0.8929908 0.8702600 0.9873164 0.9969031 0.6815512 
#        81        82        83        84        85        86        87        88 
# 0.8937189 0.8409681 0.7829405 0.9848354 0.6901625 0.8073582 0.9666041 0.6665514 
#        89        90        91        92        93        94        95        96 
# 0.6993846 0.9909923 0.9008345 0.9710941 0.8037223 0.9913632 0.9744493 0.7089660 
#        97        98        99       100       101       102       103       104 
# 0.9071738 0.9064541 0.9625371 0.9872279 0.9171603 0.9636413 0.9976224 0.9829885 
#       105       106       107       108       109       110       111       112 
# 0.9854704 0.9888092 0.8464463 0.9729353 0.9771293 0.9794313 0.9746239 0.9977302 
#       113       114       115       116       117       118       119       120 
# 0.9941859 0.9605563 0.8476794 0.9289985 0.9929982 0.9916850 0.9818957 0.9493751 
#       121       122       123       124       125       126       127       128 
# 0.9865358 0.8716778 0.9728177 0.9846364 0.9840890 0.9861783 0.9854516 0.9691512 
#       129       130       131       132       133       134       135       136 
# 0.9942007 0.9585884 0.9705389 0.9937852 0.9874192 0.9723192 0.9230503 0.9794405 
#       137       138       139       140       141       142       143       144 
# 0.8947527 0.9797055 0.9458421 0.9902488 0.9674660 0.9350646 0.9636413 0.9867931 
#       145       146       147       148       149       150 
# 0.9500265 0.9470544 0.9688318 0.9886543 0.8735433 0.9281727

What is Correspondence Analysis?

  • Aims to expose the association between two categorical variables.

  • Categorical variables measure characteristics of individuals (samples) in the form of finite discrete response levels (category levels).

  • Summarised in a two-way contingency table.

  • Focus placed on nominal categorical variables - category levels with no specific rank / order.

  • Numerous variants of CA are available for diverse problems; the interested reader is referred to the references in the vignettes.

  • With biplotEZ, focus is placed on three EZ-to-use variants (more information to follow).

Correspondence Analysis

  • Data matrix in CA() is different from PCA() and CVA().

  • The data matrix \(\mathbf{X}: r\times c\) represents the cross-tabulation of two categorical variables.

  • The elements of the data matrix represent the frequency of the co-occurrence of two particular levels of the two variables.

Consider the HairEyeColor data set in R, which summarises the hair and eye colour of male and female statistics students. For the purpose of this example only the female students will be considered:

cross_tab <- HairEyeColor[,,2]
cross_tab
#        Eye
# Hair    Brown Blue Hazel Green
#   Black    36    9     5     2
#   Brown    66   34    29    14
#   Red      16    7     7     7
#   Blond     4   64     5     8

Current functions for CA() in biplotEZ

  • biplot() |>
    • CA() |>
    • interpolate() |> fit.measures() |>
    • samples() |> newsamples() |>
    • legend.type() |>
  • plot()

Take note of the warning message:

biplot(HairEyeColor[,,2], center = TRUE) |> CA()
# Warning in CA.biplot(biplot(HairEyeColor[, , 2], center = TRUE)): Centering was
# not performed. Set biplot(center = FALSE) when performing CA().
# Object of class CA, based on 4 row levels and 4 column levels.

CA calculations

  • It is typical to express the frequencies in terms of proportions / probabilities.

  • Consider the correspondence matrix \(\mathbf{P}\):

P_mat <- cross_tab/sum(cross_tab)
P_mat
#        Eye
# Hair          Brown        Blue       Hazel       Green
#   Black 0.115015974 0.028753994 0.015974441 0.006389776
#   Brown 0.210862620 0.108626198 0.092651757 0.044728435
#   Red   0.051118211 0.022364217 0.022364217 0.022364217
#   Blond 0.012779553 0.204472843 0.015974441 0.025559105

CA calculations

  • Row masses (diagonal matrix) - \(\mathbf{D_r}\):
ca.out <- biplot(HairEyeColor[,,2], center = FALSE) |> CA()
ca.out$Dr
#           [,1]     [,2]      [,3]      [,4]
# [1,] 0.1661342 0.000000 0.0000000 0.0000000
# [2,] 0.0000000 0.456869 0.0000000 0.0000000
# [3,] 0.0000000 0.000000 0.1182109 0.0000000
# [4,] 0.0000000 0.000000 0.0000000 0.2587859
  • Column masses (diagonal matrix) - \(\mathbf{D_c}\):
ca.out <- biplot(HairEyeColor[,,2], center = FALSE) |> CA()
ca.out$Dc
#           [,1]      [,2]      [,3]       [,4]
# [1,] 0.3897764 0.0000000 0.0000000 0.00000000
# [2,] 0.0000000 0.3642173 0.0000000 0.00000000
# [3,] 0.0000000 0.0000000 0.1469649 0.00000000
# [4,] 0.0000000 0.0000000 0.0000000 0.09904153

CA calculations

  • Consider the independence model:

\[\chi^2 = \sum\frac{(\text{Observed freq.}-\text{Expected freq.})^2}{\text{Expected freq.}}\]

  • Standardised Pearson residuals (\(\mathbf{S}\)):

\[ \mathbf{S} = \mathbf{D_r}^{-\frac{1}{2}}(\mathbf{P}-\mathbf{rc'})\mathbf{D_c}^{-\frac{1}{2}}\]

  • The residuals are weighted by the inverse square roots of the row and column masses (\(\mathbf{D_r}^{-\frac{1}{2}}\) and \(\mathbf{D_c}^{-\frac{1}{2}}\)).

  • The expected proportions are represented by the product of the row and column masses (\(\mathbf{rc'}\)).

  • Biplot coordinates: singular value decomposition of \(\mathbf{S}\).

\[ \text{svd}(\mathbf{S}) = \mathbf{U\Lambda V'}\]
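These quantities can be reproduced from scratch in base R; a sketch using the same cross-tabulation (variable names are illustrative, not the package's internals):

```r
# Sketch (base R): the CA calculations above, built from scratch.
N  <- HairEyeColor[, , 2]
P  <- N / sum(N)                  # correspondence matrix
r  <- rowSums(P)                  # row masses
cc <- colSums(P)                  # column masses

# Standardised Pearson residuals S = Dr^(-1/2) (P - r c') Dc^(-1/2)
S  <- diag(1/sqrt(r)) %*% (P - r %o% cc) %*% diag(1/sqrt(cc))
dec <- svd(S)                     # biplot coordinates come from this svd

# 2-D quality of fit: proportion of total inertia recovered.
sum(dec$d[1:2]^2) / sum(dec$d^2)  # approximately 0.9833
```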

biplotEZ variants

  • The variant refers to how the singular values (\(\Lambda\)) contribute to the row and column coordinates in the biplot solution.

    • Row principal coordinate biplot (default):

    \[\begin{aligned} \text{Row coordinates: } \hspace{0.5 cm}&\mathbf{U\Lambda}\\ \text{Column coordinates: }\hspace{0.5 cm}& \mathbf{V}\end{aligned}\]

    • Row standard coordinate biplot:

      \[\begin{aligned} \text{Row coordinates: } \hspace{0.5 cm}&\mathbf{U}\\ \text{Column coordinates: }\hspace{0.5 cm}& \mathbf{V\Lambda}\end{aligned}\]

    • Symmetric Correspondence Analysis map:

      \[\begin{aligned} \text{Row coordinates: } \hspace{0.5 cm}&\mathbf{U\Lambda^{\frac{1}{2}}}\\ \text{Column coordinates: }\hspace{0.5 cm}& \mathbf{V\Lambda^{\frac{1}{2}}}\end{aligned}\]
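A base-R sketch of the three coordinate variants, recomputing \(\mathbf{S}\) as above (illustrative names, not the CA() implementation):

```r
# Sketch (base R): the three CA coordinate variants from svd(S).
N  <- HairEyeColor[, , 2]
P  <- N / sum(N)
r  <- rowSums(P); cc <- colSums(P)
S  <- diag(1/sqrt(r)) %*% (P - r %o% cc) %*% diag(1/sqrt(cc))
dec <- svd(S)
U <- dec$u[, 1:2]; V <- dec$v[, 1:2]; L <- diag(dec$d[1:2])

rows.princ <- U %*% L            # row principal coordinates (default)
cols.princ <- V
rows.stand <- U                  # row standard coordinates
cols.stand <- V %*% L
rows.symm  <- U %*% sqrt(L)      # symmetric map
cols.symm  <- V %*% sqrt(L)
```

Whatever the variant, the inner products between row and column points are the same, so all three approximate the same residual matrix.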

CA function

CA()
Argument Description
bp Object of class biplot.
dim.biplot Dimension of the biplot. Only values 1, 2 and 3 are accepted, with default 2.
e.vects Which eigenvectors (principal components) to extract, with default 1:dim.biplot.
variant Which correspondence analysis variant, with default "Princ".
lambda.scal TRUE or FALSE: Controls stretching or shrinking of column and row distances, with default FALSE.

Row principal coordinate biplot

biplot(HairEyeColor[,,2], center = FALSE) |> 
  CA() |> 
  plot()

Symmetric Correspondence Analysis map

biplot(HairEyeColor[,,2], center = FALSE) |> 
  CA(variant = "Symmetric") |> samples(col = c("darkmagenta", "forestgreen"), 
  cex = 1.7, opacity = 0.7) |> legend.type(samples = TRUE) |> plot()

Row standard coordinate biplot

biplot(HairEyeColor[,,2], center = FALSE) |> CA(variant = "Stand") |> 
  samples(col = c("darkmagenta", "forestgreen"), cex = 1.7, opacity = 0.7) |> 
  plot()

Lambda scaling

The inner product is invariant when \(\mathbf{A}\) and \(\mathbf{B}\) are scaled inversely by \(\lambda\), where \(\mathbf{A}\) represents the row coordinates and \(\mathbf{B}\) represents the column coordinates.

\[ \mathbf{AB'} = (\lambda\mathbf{A})(\mathbf{B}/\lambda)' \] An arbitrary value of \(\lambda\) can be selected, or an optimal value can be used to ensure that the average squared distance of the points in \(\lambda\mathbf{A}\) and \(\mathbf{B}/\lambda\) is equal.

\[ \lambda^4 =\frac{r}{c}\frac{||\mathbf{B}||^2}{||\mathbf{A}||^2} \]

The default setting is to not apply lambda-scaling (i.e. lambda.scal = FALSE).

The return value is lambda.val.
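A minimal sketch of optimal lambda scaling, assuming arbitrary coordinate matrices \(\mathbf{A}\) (\(r \times 2\)) and \(\mathbf{B}\) (\(c \times 2\)); the data here are random, purely for illustration:

```r
# Sketch (base R): optimal lambda scaling for row coordinates A and
# column coordinates B.
set.seed(1)
A <- matrix(rnorm(8), nrow = 4)   # 4 row points in 2 dimensions
B <- matrix(rnorm(6), nrow = 3)   # 3 column points in 2 dimensions
nr <- nrow(A); nc <- nrow(B)

lambda <- ((nr / nc) * sum(B^2) / sum(A^2))^(1/4)

# Inner products are unchanged (up to rounding)...
max(abs(A %*% t(B) - (lambda * A) %*% t(B / lambda)))
# ...and the average squared distance from the origin is now equal.
mean(rowSums((lambda * A)^2)); mean(rowSums((B / lambda)^2))
```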

Lambda scaling

par(mfrow=c(1,2))
biplot(HairEyeColor[,,2], center = FALSE) |> 
  CA(variant = "Stand", lambda.scal = FALSE) |>
  samples(col=c("palevioletred3","purple4"), cex = 1.7, opacity = 0.7) |> 
  plot()
biplot(HairEyeColor[,,2], center = FALSE) |> 
  CA(variant = "Stand", lambda.scal = TRUE) |>
  samples(col=c("palevioletred3","purple4"), cex = 1.7, opacity = 0.7) |> 
  plot()

New samples

biplot(HairEyeColor[,,2], center = FALSE) |> CA(variant = "Symmetric") |>  
  samples(pch = c(0,2), cex = 1.5) |> 
  interpolate(newdata = HairEyeColor[,,1]) |> 
  newsamples(col = c("orange","purple"), pch = c(15,17), cex = 1.7) |> 
  plot()    

Fit measures

ca.out <- biplot(HairEyeColor[,,2], center = FALSE) |> 
  CA(variant = "Symmetric") |> 
  fit.measures()
  • Quality:
ca.out$quality
# [1] 0.9833094
  • Adequacy:
ca.out$adequacy
#        Brown   Blue  Hazel  Green
# Dim 1 0.3824 0.5745 0.0425 0.0006
# Dim 2 0.5892 0.6277 0.3904 0.3926
# Dim 3 0.6102 0.6358 0.8530 0.9010
# Dim 4 1.0000 1.0000 1.0000 1.0000

Fit measures

ca.out <- biplot(HairEyeColor[,,2], center = FALSE) |> 
  CA(variant = "Symmetric") |> 
  fit.measures()
  • Row predictivities:
ca.out$row.predictivities
#        Black  Brown    Red  Blond
# Dim 1 0.6860 0.8630 0.4219 0.9974
# Dim 2 0.9932 0.9441 0.8508 0.9999
# Dim 3 1.0000 1.0000 1.0000 1.0000
# Dim 4 1.0000 1.0000 1.0000 1.0000
  • Column predictivities:
ca.out$col.predictivities
#        Brown   Blue  Hazel  Green
# Dim 1 0.9439 0.9898 0.4789 0.0113
# Dim 2 0.9990 0.9997 0.9020 0.8177
# Dim 3 1.0000 1.0000 1.0000 1.0000
# Dim 4 1.0000 1.0000 1.0000 1.0000

Canonical Variate Analysis (CVA) biplot

Aim: a dimension reduction technique that maximises the variation between classes while minimising the variation within classes.

This is achieved by the following steps:

  • Decompose the variance.
  • Find a linear mapping to the canonical space.
  • Find a low-dimensional approximation.

Variance Decomposition

The classical variance decomposition \[\mathbf{T}=\mathbf{B}+\mathbf{W},\]

has as an analogy in this setting \[ \mathbf{X'X} = \mathbf{\bar{\mathbf{X}}'C \bar{\mathbf{X}}} + \mathbf{X' [I - G(G'G)^{-1}C(G'G)^{-1}G'] X}. \]

The choice of \(\mathbf{C}\) determines the variant of CVA:

  • Weighted: \(\mathbf{C}=\mathbf{N}=\mathbf{G'G}\)
  • Unweighted: \(\mathbf{C}=\mathbf{I}_G - G^{-1}\mathbf{1}_G\mathbf{1}_G'\)
  • Unweighted with weighted centroid: \(\mathbf{C}=\mathbf{I}_G\)

Linear Mapping

Find a linear mapping

\[\mathbf{Y}=\mathbf{X}\mathbf{M}, \tag{1}\]

such that \[\frac{\mathbf{m}'\mathbf{B}\mathbf{m}}{\mathbf{m}'\mathbf{W}\mathbf{m}} \tag{2}\] is maximised subject to \(\mathbf{m}'\mathbf{W}\mathbf{m}=1\).

It can be shown that this leads to the following equivalent eigen equations:

\[ \mathbf{W}^{-1}\mathbf{BM} = \mathbf{M \Lambda} \tag{3} \]

\[ \mathbf{BM} = \mathbf{WM \Lambda} \tag{4} \]

\[ (\mathbf{W}^{-\frac{1}{2}} \mathbf{B} \mathbf{W}^{-\frac{1}{2}}) \mathbf{M} = (\mathbf{W}^{-\frac{1}{2}} \mathbf{M}) \mathbf{\Lambda} \tag{5} \]

with \(\mathbf{M'BM}= \mathbf{\Lambda}\) and \(\mathbf{M'WM}= \mathbf{I}\).

Since the matrix \(\mathbf{W}^{-\frac{1}{2}} \mathbf{B} \mathbf{W}^{-\frac{1}{2}}\) is symmetric and positive semi-definite, the eigenvalues in the matrix \(\mathbf{\Lambda}\) are non-negative and ordered. The rank of \(\mathbf{B}\) is \(\min(p, G-1)\), so only the first \(\text{rank}(\mathbf{B})\) eigenvalues are non-zero. We form the canonical variates with the transformation

\[ \bar{\mathbf{Y}} = \bar{\mathbf{X}}\mathbf{M}.\tag{6} \]
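For the weighted variant (\(\mathbf{C} = \mathbf{G'G}\)), the eigen equations can be solved in base R via the symmetric form in equation (5); a sketch on the iris data, not the CVA() implementation:

```r
# Sketch (base R, weighted CVA): solving the eigen equations for iris.
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
g <- iris$Species
G <- model.matrix(~ g - 1)                  # indicator matrix
Xbar <- solve(t(G) %*% G) %*% t(G) %*% X    # class means
B <- t(Xbar) %*% t(G) %*% G %*% Xbar        # between-class matrix (weighted)
W <- t(X) %*% X - B                         # within-class matrix

# Symmetric eigen problem on W^(-1/2) B W^(-1/2), as in equation (5).
We <- eigen(W)
Whalf.inv <- We$vectors %*% diag(1/sqrt(We$values)) %*% t(We$vectors)
ee <- eigen(Whalf.inv %*% B %*% Whalf.inv, symmetric = TRUE)
M <- Whalf.inv %*% ee$vectors               # canonical directions

round(t(M) %*% W %*% M, 10)                 # M'WM = I
Ybar <- Xbar %*% M                          # canonical variates of the means
```

With three classes, only \(\min(p, G-1) = 2\) eigenvalues are non-zero, so the first two columns of \(\mathbf{M}\) carry the CVA biplot.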

Low dimensional approximation

The first two canonical variates are given by:

\[\mathbf{\bar{Z}}=\mathbf{\bar{Y}}\mathbf{J}_2=\mathbf{\bar{X}}\mathbf{M}\mathbf{J}_2 \tag{7}\] where \(\mathbf{J'}_2=[\mathbf{I}_2 \quad \mathbf{0}]\). We add the individual sample points with the same transformation \[\mathbf{Z}=\mathbf{X}\mathbf{M}\mathbf{J}_2. \tag{8}\]

A new sample point, \(\mathbf{x}^*\), can be added by interpolation \[\mathbf{z}^*=\mathbf{x}^*\mathbf{M}\mathbf{J}_2.\tag{9}\]

CVA function

CVA()
Argument Description
bp Object of class biplot.
classes Vector of class membership. User specified, otherwise defaults to vector specified in biplot.
dim.biplot Dimension of the biplot. Only values 1, 2 and 3 are accepted, with default 2.
e.vects Which eigenvectors (principal components) to extract, with default 1:dim.biplot.
weightedCVA “weighted” or “unweightedCent” or “unweightedI”: Controls which type of CVA to perform, with default "weighted"
show.class.means TRUE or FALSE: Controls whether class means are plotted, with default TRUE.
low.dim "sample.opt" or "Bhattacharyya.dist": Controls method of constructing additional dimension(s) if dim.biplot is greater than the number of classes, with default "sample.opt".

Class means

The means() function allows the user to make adjustments to the points representing the class means.

Plotting only a selection of the class means

Argument Description
which a vector containing the groups or classes for which the means should be displayed, with default bp$g.

biplot(state.x77) |> CVA(state.region) |> means(which = c(2,3), 
  label = TRUE) |> plot()

Class means

Colours and characters for the points

The following arguments control the aesthetic options for the plotted class mean points:

Argument Description
col the colour(s) for the means, with default as the colour of the samples.
pch the plotting character(s) for the means, with default 15.
cex the character expansion(s) for the means, with default 1.
opacity transparency of means.
shade.darker a logical value indicating whether the colour of the mean points should be made a shade darker than the default or specified colour, with default TRUE.

Class means

Colours and characters for the points

biplot(state.x77) |> CVA(state.region) |> means(cex = c(1,2,3,4), 
    col = "red", pch = c(9,13,4,16)) |> plot()

Class means

Labels

The following arguments control the aesthetic options for the labels accompanying the plotted class mean points:

Argument Description
label a logical value indicating whether the means should be labelled, with default TRUE.
label.col a vector of the same length as which with label colours for the means, with default as the colour of the means.
label.cex a vector of the same length as which with label text expansions for the means, with default 0.75.
label.side the side at which the label of the plotted mean point appears, with default bottom.
label.offset the offset of the label from the plotted mean point.

Class means

Labels

biplot(state.x77) |> CVA(state.region) |> means(label = TRUE, 
  label.side = "top", label.offset = 2, label.cex = 1) |> plot()

Classification regions

This function creates classification regions for the CVA biplot.

The classify() function appends the biplot object with the following elements:

  • A confusion matrix from the classification into classes

  • The classification accuracy rate

  • A logical value indicating whether classification regions are shown in the biplot

  • A list of chosen aesthetics for the classification regions

  • The midpoints of the classification regions

Classification regions

Classification regions in the CVA biplot

biplot(state.x77) |> 
  CVA(state.region) |> classify(col=c("cornflowerblue","darkolivegreen3",
  "darkgoldenrod","aquamarine")) |> plot()

\(\alpha\)-bags containing a percentage of observations

This function creates \(\alpha\)-bags, each enclosing approximately \(100\alpha\)% of the samples in a class.

The alpha.bags() function appends the biplot object with the following elements:

  • A list of coordinates for the \(\alpha\)-bags for each group

  • A vector of colours for the \(\alpha\)-bags

  • A vector of line types for the \(\alpha\)-bags

  • A vector of line widths for the \(\alpha\)-bags

\(\alpha\)-bags containing a percentage of observations

The \(\alpha\)-bags in the CVA biplot

biplot(state.x77) |> CVA(state.region) |> 
  alpha.bags(alpha = c(0.85,0.9,0.95,0.99), lty = c(1,2,3,4)) |> plot()
# Computing 0.85 -bag for Northeast 
# Computing 0.9 -bag for South 
# Computing 0.95 -bag for North Central 
# Computing 0.99 -bag for West

Concentration ellipses

This function creates \(\kappa\)-ellipses.

The ellipses() function appends the biplot object with the following elements:

  • A list of coordinates for the \(\kappa\)-ellipses for each group

  • A vector of colours for the \(\kappa\)-ellipses

  • A vector of line types for the \(\kappa\)-ellipses

  • A vector of line widths for the \(\kappa\)-ellipses

  • A vector of \(\alpha\) values

Concentration ellipses

Concentration ellipses in the CVA biplot

biplot(state.x77) |> CVA(state.region) |> 
  ellipses(alpha = c(0.85,0.9,0.95,0.99), lwd = c(1,2,3,4)) |> plot()
# Computing 1.95 -ellipse for Northeast 
# Computing 2.15 -ellipse for South 
# Computing 2.45 -ellipse for North Central 
# Computing 3.03 -ellipse for West
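The printed \(\kappa\) values appear to be chi-squared quantiles, \(\kappa = \sqrt{\chi^2_{2;\alpha}}\) for a 2D biplot. This is an inference from the output above rather than quoted documentation, but it reproduces the numbers exactly:

```r
# kappa for a 2D concentration ellipse covering a fraction alpha of a
# normal population: square root of the chi-squared(2) quantile.
alpha <- c(0.85, 0.90, 0.95, 0.99)
round(sqrt(qchisq(alpha, df = 2)), 2)
# 1.95 2.15 2.45 3.03
```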

Measures of fit

bp <- biplot(state.x77) |> CVA(classes = state.region) |> fit.measures()

The fitted object contains the following measures of how well the biplot represents the original and canonical spaces:

  • quality: Quality of fit for canonical and original variables
  • adequacy: Adequacy of original variables
  • axis.predictivity: Axis predictivity
  • within.class.axis.predictivity: Class predictivity
  • within.class.sample.predictivity: Sample predictivity

Summary of measures of fit

The summary() function prints to screen the fit.measures stored in the object of class biplot.

bp |> summary()
# Object of class biplot, based on 50 samples and 8 variables.
# 8 numeric variables.
# 4 classes: Northeast South North Central West 
# 
# Quality of fit of canonical variables in 2 dimension(s) = 91.9% 
# Quality of fit of original variables in 2 dimension(s) = 93.4% 
# Adequacy of variables in 2 dimension(s):
#  Population      Income  Illiteracy    Life Exp      Murder     HS Grad 
# 0.453533269 0.105327455 0.107221535 0.002201286 0.208653101 0.687840023 
#       Frost        Area 
# 0.452308013 0.118544323 
# Axis predictivity in 2 dimension(s):
# Population     Income Illiteracy   Life Exp     Murder    HS Grad      Frost 
#  0.9873763  0.9848608  0.8757913  0.9050208  0.9955088  0.9970346  0.9558192 
#       Area 
#  0.9344651 
# Class predictivity in 2 dimension(s):
#     Northeast         South North Central          West 
#     0.8031465     0.9985089     0.6449906     0.9988469 
# Within class axis predictivity in 2 dimension(s):
# Population     Income Illiteracy   Life Exp     Murder    HS Grad      Frost 
# 0.02246821 0.10349948 0.27870637 0.21460313 0.29836047 0.87510975 0.22320989 
#       Area 
# 0.13603927 
# Within class sample predictivity in 2 dimension(s):
#        Alabama         Alaska        Arizona       Arkansas     California 
#    0.769417280    0.174566384    0.328610375    0.148035077    0.103141908 
#       Colorado    Connecticut       Delaware        Florida        Georgia 
#    0.357627854    0.079176621    0.438089663    0.327270922    0.558038750 
#         Hawaii          Idaho       Illinois        Indiana           Iowa 
#    0.029173037    0.167543892    0.076948041    0.473148418    0.592667777 
#         Kansas       Kentucky      Louisiana          Maine       Maryland 
#    0.774719240    0.439306768    0.190654770    0.086183357    0.284829878 
#  Massachusetts       Michigan      Minnesota    Mississippi       Missouri 
#    0.428103056    0.188094295    0.644844800    0.163103449    0.719255739 
#        Montana       Nebraska         Nevada  New Hampshire     New Jersey 
#    0.239142302    0.671350698    0.015766988    0.386053551    0.207503850 
#     New Mexico       New York North Carolina   North Dakota           Ohio 
#    0.012872885    0.008101305    0.872322617    0.457852394    0.092634247 
#       Oklahoma         Oregon   Pennsylvania   Rhode Island South Carolina 
#    0.561156131    0.158926944    0.261838286    0.482912999    0.229047767 
#   South Dakota      Tennessee          Texas           Utah        Vermont 
#    0.095865021    0.237667483    0.121494852    0.349495632    0.256983459 
#       Virginia     Washington  West Virginia      Wisconsin        Wyoming 
#    0.453608981    0.044780371    0.346223950    0.544998639    0.174849092

Rotation

The rotate() function rotates the samples and axes in the biplot by rotate.degrees degrees.

par(mfrow=c(1,2))
bp |> plot()
bp |> rotate(rotate.degrees=90)|> plot()

Reflection

The reflect() function reflects the samples and axes in the biplot about an axis: x (horizontal reflection), y (vertical reflection) or xy (diagonal reflection).

par(mfrow=c(1,2))
bp |> plot()
bp |> reflect(reflect.axis ="y")|> plot()

The argument zoom=TRUE in plot()

The argument zoom= is FALSE by default. If zoom=TRUE, a new graphical device is launched and the user is prompted to click on the desired upper left-hand and lower right-hand corners of the zoomed-in plot.

bp |>  plot(zoom=TRUE)

Using the zoom argument

1-dimensional CVA biplot of the state.x77 data

biplot(state.x77,classes=state.region) |> CVA(dim.biplot=1) |> 
  classify() |> plot()

1-dimensional PCA biplot of the iris data

biplot(iris, group.aes = iris$Species) |> PCA(dim.biplot = 1) |> 
  density1D() |> ellipses() |> plot()
# Computing 1.96 -ellipse for setosa 
# Computing 1.96 -ellipse for versicolor 
# Computing 1.96 -ellipse for virginica

3-dimensional biplots

The dim.biplot argument can be set to 3 to allow the user to create a 3D biplot. The plot() function makes use of the RGL device for the 3D display.

3D PCA biplot of the iris data

biplot(iris) |> PCA(group.aes = iris[,5], dim.biplot = 3)|> plot()

3D PCA biplot

3-dimensional biplots

3D biplot of the state.x77 data with class means

biplot(state.x77) |> CVA(classes = state.region,dim.biplot = 3) |> 
  means(col = "red", cex = 5) |> plot()

3D CVA biplot

More biplots

  • CVA biplots for two classes
  • Regression biplot
  • Spline biplot

CVA biplots for two classes

Canonical space of dimension 1.

Solve \(\mathbf{BM=WM\Lambda}\) where \(\mathbf{M} = \begin{bmatrix} \mathbf{m}_1 & \mathbf{M}_2\\ \end{bmatrix}\)

\[ \bar{\mathbf{Y}} = \bar{\mathbf{X}} \mathbf{M} = \begin{bmatrix} \bar{y}_{11} & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots\\ \bar{y}_{K1} & 0 & \dots & 0 \\ \end{bmatrix} \]

\[ \mathbf{\Lambda} = \operatorname{diag}(\lambda, 0, \dots, 0) \] Total squared reconstruction error for means: \(TSREM = \operatorname{tr}\{ (\bar{\mathbf{X}}-\hat{\bar{\mathbf{X}}})(\bar{\mathbf{X}}-\hat{\bar{\mathbf{X}}})'\} = 0\)

Total squared reconstruction error for samples: \(TSRES = \operatorname{tr}\{ ({\mathbf{X}}-\hat{{\mathbf{X}}})({\mathbf{X}}-\hat{{\mathbf{X}}})'\} >0\)

Minimise \(TSRES\) (Default option)

Alternative option: Maximise Bhattacharyya distance. For more details see

  • le Roux, N. and Gardner-Lubbe, S., 2024. A two-group canonical variate analysis biplot for an optimal display of both means and cases. Advances in Data Analysis and Classification, pp.1-28.
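The rank-one structure of \(\mathbf{\Lambda}\) for two classes can be verified numerically. A sketch using the versicolor/virginica subset of iris (not the package's internal code):

```r
# Two-class CVA: B has rank 1, so W^{-1}B has one nonzero eigenvalue.
X    <- as.matrix(iris[51:150, 1:4])
g    <- droplevels(iris[51:150, 5])
n_k  <- as.vector(table(g))
Xbar <- rowsum(X, g) / n_k                 # K x p class means
W    <- crossprod(X - Xbar[g, ])           # within-class SSP matrix
cm   <- sweep(Xbar, 2, colMeans(X))        # centred class means
B    <- crossprod(cm * n_k, cm)            # between-class SSP matrix
lambda <- eigen(solve(W) %*% B)$values     # lambda, then (near-)zeros
```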

CVA biplots for two classes

Solve \(\mathbf{BM=WM\Lambda}\) where \(\mathbf{M} = \begin{bmatrix} \mathbf{m}_1 & \mathbf{M}_2\\ \end{bmatrix}\)

Minimise \(TSRES\)

\[ \mathbf{M}^{-1} = \begin{bmatrix} \mathbf{m}^{(1)} \\ \mathbf{M}^{(2)}\\ \end{bmatrix} \]

\[ \mathbf{M}^{(2)}\mathbf{M}^{(2)'} = \mathbf{UDV}' \]

\[ \mathbf{M}_{opt} = \begin{bmatrix} \mathbf{m}_1 & \mathbf{M}_2\mathbf{V}\\ \end{bmatrix} \]
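As a sketch of the \(\mathbf{M}_{opt}\) construction, with an arbitrary invertible \(\mathbf{M}\) standing in for the fitted CVA transformation:

```r
# Build M_opt: split M^{-1} into its first row m^(1) and remaining rows
# M^(2); take the SVD of M^(2) M^(2)'; rotate the trailing columns of M
# by V. M here is an arbitrary invertible stand-in, not CVA output.
p <- 4
set.seed(1)
M    <- matrix(rnorm(p * p), p, p)
Minv <- solve(M)
M2r  <- Minv[-1, , drop = FALSE]          # M^(2): (p-1) x p
V    <- svd(M2r %*% t(M2r))$v             # from M^(2)M^(2)' = U D V'
M_opt <- cbind(M[, 1], M[, -1] %*% V)     # [m_1  M_2 V]
```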

CVA biplots for two classes

biplot(iris[51:150,]) |> CVA(classes = iris[51:150,5]) |> means(cex = 2) |>
  axes(label.dir = "Hor", label.line = c(0.8, 0, 0, 0)) |> plot()

Regression biplot

Any 2D representation of sample points, for example

library(MASS)
Zmat <- sammon(dist(iris[-102,1:4], method = "manhattan"))$points
# Initial stress        : 0.01116
# stress after  10 iters: 0.00833, magic = 0.018
# stress after  20 iters: 0.00614, magic = 0.213
# stress after  30 iters: 0.00561, magic = 0.500
# stress after  40 iters: 0.00558, magic = 0.500

To create a biplot, we need to add information on the variables.

\[ \mathbf{X}:n \times p \]

\[ \mathbf{Z}:n \times 2 \]

\[ \mathbf{X = ZB + E} \]

\[ \mathbf{B = (Z'Z)}^{-1}\mathbf{Z'X} \]
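In R this least-squares fit, \(\mathbf{B = (Z'Z)}^{-1}\mathbf{Z'X}\), is a one-liner. A toy sketch with random \(\mathbf{X}\) and \(\mathbf{Z}\) standing in for the data and its 2D representation:

```r
# B = (Z'Z)^{-1} Z'X: regress each original variable on the 2D
# coordinates (no intercept, assuming both X and Z are centred).
set.seed(1)
n <- 30; p <- 4
X <- scale(matrix(rnorm(n * p), n, p), scale = FALSE)  # centred n x p
Z <- scale(matrix(rnorm(n * 2), n, 2), scale = FALSE)  # centred n x 2
B    <- solve(crossprod(Z), crossprod(Z, X))           # 2 x p
B_lm <- coef(lm(X ~ Z - 1))                            # identical fit
```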

Regression biplot

biplot(iris[-102,]) |> regress(Zmat) |> plot()

Spline biplot

Are linear axes a good representation when the transformation from \(\mathbf{X}:n \times p\) to \(\mathbf{Z}:n \times 2\) is nonlinear?

Replace linear regression with splines.

Spline biplot

biplot(iris[-102,1:4]) |> regress(Zmat, axes = "splines") |> plot()
# Calculating spline axis for variable 1 
# Calculating spline axis for variable 2 
# Calculating spline axis for variable 3 
# Calculating spline axis for variable 4

In conclusion

Biplots developed since 1971.

R code (originally S-PLUS) developed since 1997.

The biplotEZ package encapsulates the developments in a single package.

Development of biplotEZ is continuing:

  • Categorical variables: Multiple correspondence analysis & Categorical PCA.
  • Shiny / plotly version for interactivity.
  • Modelling: bi-additive biplots, biplots for logistic regression.

Applying biplotEZ: an R package for multi-dimensional data visualisation

Contact one of the team members or e-mail muvisu@sun.ac.za.